The ML Observability team is committed to empowering our customers with an advanced observability platform, specifically designed for applications that increasingly integrate machine learning components such as large language models and generative AI. We provide comprehensive monitoring and diagnostics for ML-based components, tracking model performance, drift, fairness, and system stability. Our platform also offers model prediction explainability and root-cause analysis, enhancing organizations' confidence in the reliability of their deployments. You can learn more about our ML/LLM Observability solution through the links listed below.
As the Engineering Manager, you will lead a team focused on enhancing and expanding Datadog's ML Observability product. Positioned at the forefront of R&D, you will emphasize rigor and experimentation to design, refine, and implement advanced techniques for evaluating and monitoring AI components, LLMs in particular, in our customers' applications. Your leadership and expertise in both engineering and applied science will be pivotal in shaping the direction of our product, ensuring Datadog remains a key player in this rapidly evolving field.
What we're doing with ML Observability:
DASH Keynote - https://lnkd.in/edq4mhdv
Datadog Blog on ML Observability - https://www.datadoghq.com/blog/datadog-llm-observability/
At Datadog, we value our office culture: the relationships and collaboration it builds and the creativity it brings to the table. We operate as a hybrid workplace to ensure our Datadogs can create a work-life harmony that best fits them.
What You’ll Do:
- Manage and mentor a team of engineers, fostering a collaborative and innovative work environment
- Leverage your technical expertise in software engineering and applied science to guide the team in building robust and scalable solutions
- Apply your experience with LLMs to enhance the product's capabilities in evaluating and monitoring LLM-based applications
- Explore and implement new techniques and tools to provide deeper insights into model behavior, drift, fairness, and interpretability
- Engage with senior management and executives, articulating complex technical concepts clearly and precisely
- Stay current with industry trends and advancements in machine learning and observability, driving innovation within the team
Who You Are:
- Proven experience in software engineering and applied science, with a focus on engineering LLM-based systems in production
- Demonstrated experience managing small teams of software engineers and/or applied scientists, with a track record of delivering high-quality products
- Strong software development skills and proficiency in Python and Go
- Strong understanding of machine learning theory, statistics, and fundamentals
- Excellent communication abilities to convey complex technical concepts clearly
- A collaborative mindset and proven experience working on cross-functional teams
- A proactive approach with a passion for continuous learning and innovation
Salary: $187,000 - $240,000
Location: Boston, MA; New York, NY
Remote: n/a
Company: Datadog
Posted: 2025-02-22
Job Closed